---
slug: /ai-agents
description: Learn how to build reproducible AI agents with Dagger using OpenAI, Anthropic, Gemini, Llama, DeepSeek, Qwen, and more.
---

# Build AI agents with Dagger

## Overview

Dagger can be used as a runtime and programming environment for AI agents. Benefits include *reproducible execution*, *end-to-end observability*, *multi-model support*, *rapid iteration*, and *easy integration*.

## Architecture

Dagger's module system allows implementing agentic features as modular components that you can integrate into your application, or use individually.

Each module has the following features:

- Runs in containers, for maximum portability and reproducibility
- Can be run from the command line, or programmatically via an API
- Generated bindings for Go, Python, TypeScript, PHP (experimental), Java (experimental), and Rust (experimental).
- End-to-end tracing of prompts, tool calls, and even low-level system operations. 100% of agent state changes are traced.
- Cross-language extensions. Add your own modules in any language.
- Platform-independent. No infrastructure lock-in! Runs on any hosting provider that can run containers.

Dagger and your IDE are the only dependencies for developing and running these modules.
The entire environment is containerized, for maximum portability.

## Community

Building AI agents on Dagger is an exciting new use case *that still has rough edges*. We strongly recommend [joining our community Discord](https://discord.gg/KK3AfBP8Gw).
The Dagger community is very welcoming, and will be happy to answer your questions, discuss your use case ideas, and help you get started.

Do this now! It will make the rest of the experience more productive, and more fun.

See also [this Twitter thread](https://x.com/solomonstre/status/1891205257516003344) for examples, discussions and demos.

## Examples

Here are several repositories containing Dagger modules with agentic capabilities, which you can use as inspiration.

- [toy-programmer](https://github.com/shykes/toy-programmer): a very, very simple programmer micro-agent for demo purposes
- [melvin](https://github.com/dagger/agents/tree/main/melvin): Melvin is [Devin](https://devin.ai)'s little cousin 😄. An experimental open-source coding agent, made of small composable modules rather than one monolithic app.
- [multiagent](https://github.com/dagger/agents/tree/main/multiagent-demo): a demo using multiple LLMs to solve a problem
- [github-go-coder](https://github.com/dagger/agents/tree/main/github-go-coder): a Go programmer micro-agent that receives assignments from GitHub issues and creates PRs with its solutions
- [cypress-test-writer](https://github.com/jpadams/cypress-test-writer): an agent that compares two branches in git for a UI change and creates a Cypress test to cover the change.
- [laravel-assistant](https://github.com/jasonmccallister/laravel-assistant): an agent that reviews git changes in a local Laravel project to generate tests.
- [tictactoe](https://github.com/kpenfound/agents/tree/main/tictactoe): an agent that plays Tic Tac Toe with a human player.
- [dockerfile-optimizer](https://github.com/dagger/agents/tree/main/dockerfile-optimizer): an agent that analyzes your Dockerfile and suggests improvements for better efficiency, security, and best practices.
- [test-debugger](https://github.com/kpenfound/greetings-api/blob/main/DEBUGGER_AGENT.md): an agent that automatically debugs failing tests in CI.
- [technical-content-summarizer](https://github.com/jasonmccallister/technical-content-summarizer): an agent that accepts a URL, optional min and max character length and forbidden words and summarizes the content for a non-technical audience.
- [swe-agent](https://github.com/kpenfound/greetings-api/blob/main/SWE_AGENT.md): an agent that gets assigned GitHub issues and solves them with pull requests.

For more explanation of agent design patterns, see the [FAQ](#faq) below.

## Initial setup

### 1. Install Dagger with LLM support

*Note: the latest version is `0.17.0-llm.9`. It was released on March 8, 2025. If you are running an older build, we recommend upgrading.*

You will need a *development version* of Dagger which adds native support for LLM prompting and tool calling.

Once this feature is merged (current target is 0.17), a development build will no longer be required.

Install the development version of LLM-enabled Dagger:

```console
curl -fsSL https://dl.dagger.io/dagger/install.sh | DAGGER_VERSION=0.17.0-llm.9 BIN_DIR=/usr/local/bin sh
```

You can adjust `BIN_DIR` to customize where the `dagger` CLI is installed.

Verify that your Dagger installation works:

```console
$ dagger -c version
v0.17.0-llm.9
```

### 2. Configure LLM endpoints

Dagger uses your system's standard environment variables to route LLM requests.
Check the [source code](https://github.com/dagger/dagger/blob/llm/core/llm.go#L250) for the latest, but these are the most commonly used variables:

#### Anthropic
- `ANTHROPIC_API_KEY`: required
- `ANTHROPIC_MODEL`: if unset, defaults to `"claude-3-5-sonnet-latest"`
- [more model name strings](https://docs.anthropic.com/en/docs/about-claude/models/all-models)

#### OpenAI
- `OPENAI_API_KEY`: required
- `OPENAI_MODEL`: if unset, defaults to `"gpt-4o"`
- **optional extras**
  - `OPENAI_BASE_URL`: for alternative OpenAI-compatible endpoints like Azure or local models
  - `OPENAI_AZURE_VERSION`: for use with [Azure's API to OpenAI](https://learn.microsoft.com/en-us/azure/ai-services/openai/overview). See below.
- [more model name strings](https://platform.openai.com/docs/models) within individual model docs

#### Google Gemini
- `GEMINI_API_KEY`: required
- `GEMINI_MODEL`: if unset, defaults to `"gemini-2.0-flash"`
- [more model name strings](https://ai.google.dev/gemini-api/docs/models/gemini)

Dagger will look for these variables in your environment, or in a `.env` file in the current directory (`.env` files in parent directories are not yet supported).

Secrets in the `.env` file can use Dagger Secrets management, for example:

```plaintext
# key stored in 1Password
OPENAI_API_KEY="op://Private/OpenAI API Key/password"
```
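
The variables and defaults listed above can be pictured as a small lookup table. The following Go sketch is only an illustration of that table, not Dagger's actual resolution logic: the `providerConfig` type and the `resolveModels` function are hypothetical names, and treating a provider as configured whenever its API key is set is an assumption made for this example.

```go
package main

import (
	"fmt"
	"os"
)

// providerConfig describes one provider's variables as documented above.
// The type itself is hypothetical and exists only for this sketch.
type providerConfig struct {
	name         string
	keyVar       string
	modelVar     string
	defaultModel string
}

// providers mirrors the documented variable names and default models.
var providers = []providerConfig{
	{"anthropic", "ANTHROPIC_API_KEY", "ANTHROPIC_MODEL", "claude-3-5-sonnet-latest"},
	{"openai", "OPENAI_API_KEY", "OPENAI_MODEL", "gpt-4o"},
	{"gemini", "GEMINI_API_KEY", "GEMINI_MODEL", "gemini-2.0-flash"},
}

// resolveModels returns a provider-name -> model map for every provider
// whose API key is set, falling back to the documented default model
// when the model variable is unset. (Assumption: key set == configured.)
func resolveModels(getenv func(string) string) map[string]string {
	models := map[string]string{}
	for _, p := range providers {
		if getenv(p.keyVar) == "" {
			continue // the API key is required, so skip this provider
		}
		model := getenv(p.modelVar)
		if model == "" {
			model = p.defaultModel
		}
		models[p.name] = model
	}
	return models
}

func main() {
	// Print which providers the current environment would enable.
	for name, model := range resolveModels(os.Getenv) {
		fmt.Printf("%s: %s\n", name, model)
	}
}
```

Running this in the shell where you plan to use Dagger is a quick way to check which keys and model overrides are actually exported. Note that it reads only the process environment, not the `.env` file.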

#### Using with Ollama

You can use Ollama as a local LLM provider. Here's how to set it up:

1. Install Ollama from [ollama.ai](https://ollama.ai)

2. Start the Ollama server with host binding:

```shell
OLLAMA_HOST="0.0.0.0:11434" ollama serve
```

3. Get your host machine's IP address:

On macOS / Linux (older distributions):

```shell
ifconfig | grep "inet " | grep -v 127.0.0.1
```

On Linux (modern distributions):

```shell
ip addr | grep "inet " | grep -v 127.0.0.1
```

This step is needed because the LLM type runs inside the Dagger engine and needs to reach your local Ollama service. Automatic tunneling is being explored, but for now you need to use your machine's actual IP address instead of `localhost` so that the containers can communicate with Ollama.

4. Configure the following environment variables (replace `YOUR_IP` with the IP address from step 3):

```plaintext
OPENAI_BASE_URL=http://YOUR_IP:11434/v1/
OPENAI_MODEL=llama3.2
```

For example, if your IP is 192.168.64.1:
```plaintext
OPENAI_BASE_URL=http://192.168.64.1:11434/v1/
OPENAI_MODEL=llama3.2
```

5. Pull some models to your local Ollama service:

```shell
ollama pull llama3.2
```

Note that for the LLM to use Dagger's API as a tool, the model must support tool calling.
Ollama curates a list of models that support tools: [Ollama models supporting tools](https://ollama.com/search?c=tools)

#### Using with Amazon Bedrock (tbd)

#### Using with Azure OpenAI Service

If you're using Azure OpenAI Service, configure your environment with the following variables:

```plaintext
OPENAI_BASE_URL=https://<azure-openai-resource>.cognitiveservices.azure.com
OPENAI_API_KEY=<azure openai deployment api key>
OPENAI_MODEL=<model to use by default>          # example: gpt-4o
OPENAI_AZURE_VERSION=<azure openai api version> # example: 2024-12-01-preview
```

## Run modules from the command line

Use the `dagger` CLI to load a module and call its functions.

For example, to use the `toy-programmer` agent module:

```console
dagger -m https://github.com/shykes/toy-programmer
```

Then, run this command in the Dagger shell:

```dagger
.help
```

This prints available functions. Let's call one:

```dagger
go-program "develop a curl clone" | terminal
```

This calls the `go-program` function with a description of a program to write, then runs the `terminal` function on the returned container.

You can use tab-completion to explore other available functions.

## Integrate Dagger into your application

You can embed Dagger modules into your application.
Supported languages are Python, TypeScript, Go, Java, and PHP, with more language support under development.

1. Initialize a Dagger module at the root of your application.
This doesn't need to be the root of your git repository - Dagger is monorepo-ready.

```console
dagger init
```

2. Install the modules you wish to load

For example, to install the toy-workspace module:

```console
dagger install github.com/shykes/toy-programmer/toy-workspace
```

3. Install a generated client in your project

*TODO: this feature is not yet merged in a stable version of Dagger*

This will configure Dagger to generate client bindings for the language of your choice.

For example, if your project is a Python application:

```console
dagger client install python
```

4. Re-generate clients

*TODO: this feature is not yet merged in a stable version of Dagger*

Any time you need to re-generate your client, run:

```console
dagger client generate
```

## FAQ

### How does the agentic loop work?

Consider the following code from the [toy-programmer](https://github.com/shykes/toy-programmer):

```go
result := dag.Llm().
		WithToyWorkspace(dag.ToyWorkspace().Write("assignment.txt", assignment)).
		WithPrompt("You are an expert go programmer. You have access to a workspace").
		WithPrompt("Complete the assignment written at assignment.txt").
		WithPrompt("Don't stop until the code builds").
		ToyWorkspace().
		Container()
```

This code creates a new LLM, gives it a workspace (described below) with an assignment, and prompts it to complete the assignment. The LLM then runs in a loop, calling tools and iterating on its work, until it completes the assignment. The entire loop happens inside the LLM object in this snippet, so the value of `result` is the completed assignment.
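
To make the idea of "a loop that calls tools until the work is done" concrete, here is a minimal, self-contained Go sketch of a generic agentic loop. It is not Dagger's implementation: the `model`, `toolCall`, `tools`, and `runLoop` names are hypothetical, and the LLM is replaced by a scripted function so the example runs without any API key.

```go
package main

import (
	"fmt"
	"strings"
)

// toolCall is a hypothetical request from the model to run one tool.
type toolCall struct {
	name string
	arg  string
}

// model is a hypothetical stand-in for an LLM: given the conversation so
// far, it either requests a tool call or returns a final reply (call == nil).
type model func(history []string) (call *toolCall, reply string)

// tools maps tool names to plain functions.
type tools map[string]func(arg string) string

// runLoop asks the model for the next step, executes any requested tool,
// and appends the tool output to the history until the model replies or
// maxSteps is reached.
func runLoop(m model, t tools, prompt string, maxSteps int) (string, []string) {
	history := []string{"user: " + prompt}
	for step := 0; step < maxSteps; step++ {
		call, reply := m(history)
		if call == nil {
			history = append(history, "assistant: "+reply)
			return reply, history
		}
		out := "unknown tool"
		if fn, ok := t[call.name]; ok {
			out = fn(call.arg)
		}
		history = append(history, fmt.Sprintf("tool %s(%q): %s", call.name, call.arg, out))
	}
	return "", history // step budget exhausted without a final reply
}

// scriptedModel fakes an LLM that writes a file, builds it, then stops.
func scriptedModel(history []string) (*toolCall, string) {
	last := history[len(history)-1]
	switch {
	case strings.HasPrefix(last, "user:"):
		return &toolCall{name: "write", arg: "package main"}, ""
	case strings.HasPrefix(last, "tool write"):
		return &toolCall{name: "build"}, ""
	default:
		return nil, "done: the code builds"
	}
}

// demoTools returns two trivial tools that always succeed.
func demoTools() tools {
	return tools{
		"write": func(string) string { return "ok" },
		"build": func(string) string { return "ok" },
	}
}

func main() {
	reply, history := runLoop(scriptedModel, demoTools(), "develop a curl clone", 10)
	for _, line := range history {
		fmt.Println(line)
	}
	fmt.Println("final reply:", reply)
}
```

In the Dagger snippet above, you never write this loop yourself; the sketch only shows the general shape of what happens between the last prompt and the final result.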

### How do I give the LLM access to tools?

Dagger [modules](features/modules.mdx) are collections of Dagger functions. When you give a module to the LLM object, every function is turned into a tool that the LLM can call.

Consider the following code from the [toy-programmer](https://github.com/shykes/toy-programmer):

```go
result := dag.Llm().
		WithToyWorkspace(dag.ToyWorkspace().Write("assignment.txt", assignment)).
		WithPrompt("You are an expert go programmer. You have access to a workspace").
		WithPrompt("Complete the assignment written at assignment.txt").
		WithPrompt("Don't stop until the code builds").
		ToyWorkspace().
		Container()
```

In this code, an instance of the `ToyWorkspace` module is given to the LLM, so the LLM can call any function in the `ToyWorkspace` module such as read, write, and build.
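
One way to picture "every function is turned into a tool" is with Go reflection. The sketch below is an analogy under stated assumptions, not how Dagger exposes modules: `Workspace`, `toolNames`, and `callTool` are hypothetical names, and the in-memory `Workspace` is not the real `ToyWorkspace` API.

```go
package main

import (
	"fmt"
	"reflect"
)

// Workspace is a hypothetical in-memory stand-in for a module object.
type Workspace struct {
	files map[string]string
}

// Read returns the content stored at path.
func (w *Workspace) Read(path string) string { return w.files[path] }

// Write stores content at path.
func (w *Workspace) Write(path, content string) string {
	w.files[path] = content
	return "ok"
}

// Build pretends to compile the workspace.
func (w *Workspace) Build() string { return "build ok" }

// toolNames lists the exported methods of obj: one tool per function.
func toolNames(obj any) []string {
	t := reflect.TypeOf(obj)
	names := make([]string, 0, t.NumMethod())
	for i := 0; i < t.NumMethod(); i++ {
		names = append(names, t.Method(i).Name)
	}
	return names
}

// callTool looks up a tool by name and invokes it with string arguments,
// similar to how a model's tool call might be dispatched.
func callTool(obj any, name string, args ...string) (string, error) {
	m := reflect.ValueOf(obj).MethodByName(name)
	if !m.IsValid() {
		return "", fmt.Errorf("unknown tool %q", name)
	}
	if m.Type().NumIn() != len(args) {
		return "", fmt.Errorf("tool %q expects %d arguments, got %d", name, m.Type().NumIn(), len(args))
	}
	in := make([]reflect.Value, len(args))
	for i, a := range args {
		in[i] = reflect.ValueOf(a)
	}
	return m.Call(in)[0].String(), nil
}

func main() {
	ws := &Workspace{files: map[string]string{}}
	fmt.Println("tools:", toolNames(ws))
	if _, err := callTool(ws, "Write", "main.go", "package main"); err != nil {
		fmt.Println("error:", err)
		return
	}
	content, err := callTool(ws, "Read", "main.go")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("main.go:", content)
}
```

The point of the analogy is that the set of tools is derived from the object's functions, so a smaller object gives the model a smaller, easier-to-use toolset.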

### What is this "workspace" object that most of the examples use?

A workspace is a common design pattern that many agents implement. It is simply a Dagger module that provides an object with functions to read and write files, run tests, and generally interact with a Directory or Container. The workspace module is passed to the LLM, and the LLM uses it to complete its task. This pattern is popular because the Directory and Container types have many functions, and an LLM can easily get lost or produce inconsistent results. A workspace, that is, a Dagger module with a limited set of functions like read/write/test, is an easy way to restrict the tools the LLM has access to and to provide a consistent interface for interacting with the environment.

For example workspaces, see [toy-workspace](https://github.com/shykes/toy-programmer/tree/main/toy-workspace) or [kpenfound/dag/workspace](https://github.com/kpenfound/dag/tree/main/workspace).

### How do I get information or objects out of the LLM?

Consider the following code from the [toy-programmer](https://github.com/shykes/toy-programmer):

```go
result := dag.Llm().
		WithToyWorkspace(dag.ToyWorkspace().Write("assignment.txt", assignment)).
		WithPrompt("You are an expert go programmer. You have access to a workspace").
		WithPrompt("Complete the assignment written at assignment.txt").
		WithPrompt("Don't stop until the code builds").
		ToyWorkspace().
		Container()
```

In this example, the work produced by the LLM is contained in the `ToyWorkspace`, so you want to get that work back out after the LLM is done. The work is stored in the workspace's `Container` field. Because you gave the LLM the workspace using `WithToyWorkspace()`, you can get the resulting workspace and the container within it by chaining `.ToyWorkspace().Container()`. `result` is then the Container with the completed work.

If you're simply interested in the messaging between the assistant and user, you can use `.History()` to get the full conversation or `.LastReply()` to get the final response.

### My agent isn't behaving properly. How do I debug it?

First, to get more information about the agent's behavior, you can increase the verbosity of the shell or inspect the run in Dagger Cloud. This is the best way to see what steps the LLM took, the details of its thinking between each tool call, and what the tool calls were.

Second, once you understand how the agent is behaving, you can try to adjust the prompt to get the desired behavior. This is often a matter of trial and error, and can be a bit of an art.
